Characterization and modeling of multicast communication in cache-coherent manycore processors

Authors

  • Sergi Abadal
  • Raul Martinez
  • Josep Solé-Pareta
  • Eduard Alarcón
  • Albert Cabellos-Aparicio
Abstract

The scalability of Network-on-Chip (NoC) designs has become a rising concern as we enter the manycore era. Multicast support represents a particular yet relevant case within this context, mainly due to the poor performance of NoCs in the presence of this type of traffic. Multicast techniques are typically evaluated using synthetic traffic or within a full system, which is either simplistic or costly, given the lack of realistic traffic models that distinguish between unicast and multicast flows. To bridge this gap, this paper presents a trace-based multicast traffic characterization, which explores the scaling trends of aspects such as the multicast intensity or the spatiotemporal injection distribution for different coherence schemes. This analysis is the basis upon which the concept of multicast source prediction is proposed, and upon which a multicast traffic model is built. Both aspects pave the way for the development and accurate evaluation of advanced NoCs in the context of manycore computing.

Similar resources

Cache-aware Parallel Programming for Manycore Processors

With rapidly evolving technology, multicore and manycore processors have emerged as promising architectures to benefit from increasing transistor numbers. The transition towards these parallel architectures makes today an exciting time to investigate challenges in parallel computing. The TILEPro64 is a manycore accelerator, composed of 64 tiles interconnected via multiple 8×8 mesh networks. It ...

Efficient Communication and Synchronization on Manycore Processors

The increased number of cores integrated on a chip has brought about a number of challenges. Concerns about the scalability of cache coherence protocols have urged both researchers and practitioners to explore alternative programming models, where cache coherence is not a given. Message passing, traditionally used in distributed systems, has surfaced as an appealing alternative to shared memory...

Raexplore: Enabling Rapid, Automated Architecture Exploration for Full Applications

We present Raexplore, a performance modeling framework for architecture exploration. Raexplore enables rapid, automated, and systematic search of the architecture design space by combining hardware counter-based performance characterization and analytical performance modeling. We demonstrate Raexplore for two recent manycore processors, the IBM BlueGene/Q compute chip and the Intel Xeon Phi, targeting a set...

LightScan: Faster Scan Primitive on CUDA Compatible Manycore Processors

Scan (or prefix sum) is a fundamental and widely used primitive in parallel computing. In this paper, we present LightScan, a faster parallel scan primitive for CUDA-enabled GPUs, which investigates a hybrid model combining intra-block computation and inter-block communication to perform a scan. Our algorithm employs warp shuffle functions to implement fast intra-block computation and takes adva...

An overview about Networks-on-Chip with multicast support

Modern System-on-Chip (SoC) platforms typically consist of multiple processors and a communication interconnect between them. Network-on-Chip (NoC) arises as a solution to interconnect these systems, providing a scalable, reusable, and efficient interconnect. For these SoC platforms, multicast communication is significantly used for parallel applications. Cache coherency in distributed ...

Journal:
  • Computers & Electrical Engineering

Volume 51

Publication date: 2016